AMENDMENTS TO THE CLAIMS

This listing of claims will replace all prior versions, and listings, of claims 
in the application: 

Listing of Claims: 



1 1. (Previously Presented) A method for characterizing a document with

2 respect to clusters of conceptually related words, comprising: 

3 receiving the document, wherein the document contains a set of words; 

4 selecting candidate clusters of conceptually related words that are related 

5 to the set of words; 

6 wherein the candidate clusters are selected using a model that explains 

7 how sets of words are generated from clusters of conceptually related words, 

8 wherein the conceptually related words are words that relate to a common topic; 

9 and 

10 constructing a set of components to characterize the document, wherein

11 the set of components includes components for candidate clusters, wherein each

12 component indicates a degree to which a corresponding candidate cluster is

13 related to the set of words,

14 wherein the set of components provides an abstract representation for the

15 document, wherein the abstract representation is subsequently used as a substitute

16 for the document during query operations involving the document.

1 2. (Original) The method of claim 1, wherein the model is a probabilistic

2 model, which contains nodes representing random variables for words and for 

3 clusters of conceptually related words. 
1 3. (Original) The method of claim 2, wherein each component in the set of

2 components indicates a degree to which a corresponding candidate cluster is 

3 active in generating the set of words. 

1 4. (Currently Amended) The method of claim 3, 

2 wherein nodes in the probabilistic model are coupled together by weighted 

3 links; and 

4 wherein a firing of a cluster node in the probabilistic model

5 activates a weighted link from the cluster node to another node to cause the

6 other node to fire.

1 5. (Currently Amended) The method of claim 4, wherein for a node

2 which has multiple parent nodes that are active, the probability that the node does

3 not fire is the product of the probabilities that links from the active parent nodes 

4 do not fire. 
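
For illustration only, the combination rule recited in claim 5 can be sketched in Python as follows. The function names and the example weights are hypothetical, and the sketch assumes that each link weight is the probability that the link fires when its parent node is active.

```python
from math import prod


def prob_node_does_not_fire(link_fire_probs):
    """Probability that a node stays silent.

    link_fire_probs holds, for each ACTIVE parent, the assumed probability
    that the link from that parent to the node fires. The node fails to
    fire only if every one of those links fails to fire.
    """
    return prod(1.0 - p for p in link_fire_probs)


def prob_node_fires(link_fire_probs):
    """Complement of the product above."""
    return 1.0 - prob_node_does_not_fire(link_fire_probs)


# Hypothetical node with two active parents whose links fire with
# probability 0.5 and 0.2 respectively.
example_links = [0.5, 0.2]
print(prob_node_does_not_fire(example_links))
print(prob_node_fires(example_links))
```

With no active parents the product is empty, so the sketch returns a non-firing probability of 1.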

1 6. (Original) The method of claim 2, wherein the probabilistic model 

2 includes a universal node that is always active and that has weighted links to all 

3 cluster nodes. 

1 7. (Original) The method of claim 4, wherein selecting the candidate 

2 clusters involves: 

3 constructing an evidence tree by starting with terminal nodes associated 

4 with the set of words in the document, and following links in the reverse direction 

5 to parent cluster nodes; 

6 using the evidence tree to estimate a likelihood that each parent cluster 

7 node was active in generating the set of words; and




XXX W:\Oooglfi\OGL-007 1 -00-US\TrttlimAinendmoiit.doc 



PAGE 6/21 * RCVD AT 8/6/2007 3:30:50 PM [Eastern Daylight Time] * SVR:USPTO-EFXRF-5/5 * DNIS:2738300 * CS!D:+1 530 750 1665 * DURATION (mm-ss): 07-24 



08/06/2007 13:01 +1-530-759-1665 PVF PAGE 07/21 



8 selecting a parent cluster node to be a candidate cluster node based on its 

9 estimated likelihood. 
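
For illustration only, the candidate selection recited in claim 7 might be sketched as below. The claim does not fix a scoring formula, so the score used here (prior probability times the probability that at least one link to an observed word fires) is an assumption, as are the names `select_candidates`, `links`, `priors`, and `threshold`. The sketch follows links back by a single level only.

```python
from collections import defaultdict
from math import prod


def select_candidates(doc_words, links, priors, threshold=0.05):
    """Pick candidate clusters for a document.

    links:  {cluster: {word: weight}} forward links (hypothetical layout)
    priors: {cluster: unconditional probability that the cluster is active}
    """
    # Reverse index so links can be followed from words back to parents.
    parents_of = defaultdict(dict)
    for cluster, children in links.items():
        for word, weight in children.items():
            parents_of[word][cluster] = weight

    # Start at the terminal (word) nodes and walk to the parent clusters.
    # set() ensures each terminal node is counted once.
    evidence = defaultdict(list)
    for word in set(doc_words):
        for cluster, weight in parents_of.get(word, {}).items():
            evidence[cluster].append(weight)

    # Assumed stand-in score, not a formula taken from the claims.
    scores = {
        cluster: priors.get(cluster, 0.0) * (1.0 - prod(1.0 - w for w in ws))
        for cluster, ws in evidence.items()
    }
    return {c: s for c, s in scores.items() if s >= threshold}


links = {
    "pets": {"cat": 0.6, "dog": 0.5},
    "finance": {"bank": 0.7, "cat": 0.01},
}
priors = {"pets": 0.2, "finance": 0.2}
print(select_candidates(["cat", "dog", "cat"], links, priors))
```

In this toy example only the "pets" cluster clears the threshold, because "finance" is linked to the observed words by a single very weak link.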

1 8. (Original) The method of claim 7, wherein estimating the likelihood that 

2 a given parent node is active in generating the set of words may involve 

3 considering: 

4 the unconditional probability that the given parent node is active; 

5 conditional probabilities that the given parent node is active assuming 

6 parent nodes of the given parent node are active; and 

7 conditional probabilities that the given parent node is active assuming 

8 child nodes of the given parent node are active. 

1 9. (Original) The method of claim 8, wherein considering the conditional 

2 probabilities involves considering weights on links between nodes. 

1 10. (Original) The method of claim 7 wherein estimating the likelihood

2 that a given parent node is active in generating the set of words involves marking 

3 terminal nodes during the estimation process to ensure that terminal nodes are not 

4 factored into the estimation more than once. 

1 11. (Original) The method of claim 7, wherein constructing the evidence 

2 tree involves pruning unlikely nodes from the evidence tree. 

1 12. (Original) The method of claim 3, wherein during construction of the 

2 set of components, the degree to which a candidate cluster is active in generating 

3 the set of words is determined by calculating a probability that a candidate cluster 

4 is active in generating the set of words. 
1 13. (Original) The method of claim 3, wherein during construction of the

2 set of components, the degree to which a candidate cluster is active in generating 

3 the set of words is determined by multiplying a probability that a candidate cluster 

4 is active in generating the set of words by an activation for the candidate cluster, 

5 wherein the activation indicates how many links from the candidate cluster to 

6 other nodes are likely to fire. 



1 14. (Original) The method of claim 1, wherein constructing the set of

2 components involves normalizing the set of components. 
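
For illustration only, claims 12 through 14 can be read together as the following Python sketch. The function name `build_components` is hypothetical, and the normalization shown (scaling so that the components sum to one) is an assumption, since the claims do not specify a particular normalization.

```python
def build_components(prob_active, activation=None):
    """Build a normalized component per candidate cluster.

    prob_active: {cluster: probability the cluster is active}      (claim 12)
    activation:  {cluster: expected number of links that fire}     (claim 13)
                 If omitted, the raw component is the probability itself.
    """
    activation = activation or {}
    raw = {
        cluster: p * activation.get(cluster, 1.0)
        for cluster, p in prob_active.items()
    }
    total = sum(raw.values())
    if total == 0:
        return raw  # nothing to normalize
    # Assumed normalization: components sum to 1 (claim 14).
    return {cluster: value / total for cluster, value in raw.items()}


probs = {"pets": 0.8, "finance": 0.1}
acts = {"pets": 3.0, "finance": 1.0}
print(build_components(probs))        # probability only
print(build_components(probs, acts))  # probability times activation
```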

1 15. (Original) The method of claim 3, wherein constructing the set of 

2 components involves approximating a probability that a given candidate cluster is 

3 active over states of the probabilistic model that could have generated the set of 

4 words. 

1 16. (Original) The method of claim 15, wherein approximating the 

2 probability involves: 

3 selecting states for the probabilistic model that are likely to have generated 

4 the set of words in the document; and 

5 considering only selected states while calculating the probability that the 

6 given candidate cluster is active. 
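
For illustration only, the approximation recited in claims 15 and 16 might look like the sketch below. It assumes that each selected state is given as a set of active clusters together with an unnormalized likelihood of having generated the document's words; the function name and the example numbers are hypothetical.

```python
def approx_prob_active(cluster, selected_states):
    """Approximate P(cluster is active) using only the selected states.

    selected_states: list of (active_clusters, likelihood) pairs, where
    active_clusters is a set and likelihood is an unnormalized weight.
    States that were not selected are simply ignored.
    """
    total = sum(weight for _, weight in selected_states)
    if total == 0:
        return 0.0
    active_mass = sum(
        weight for active, weight in selected_states if cluster in active
    )
    return active_mass / total


# Three hypothetical states judged likely to have generated the words.
states = [
    (frozenset({"pets"}), 0.06),
    (frozenset({"pets", "finance"}), 0.01),
    (frozenset(), 0.03),
]
print(approx_prob_active("pets", states))
print(approx_prob_active("finance", states))
```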



1 17. (Original) The method of claim 16, wherein selecting a state that is 

2 likely to have generated the set of words involves: 

3 randomly selecting a starting state for the probabilistic model; and 

4 performing hill-climbing operations beginning at the starting state to reach 

5 a state that is likely to have generated the set of words. 
1 18. (Original) The method of claim 17, wherein performing the

2 hill-climbing operations involves periodically changing states of individual candidate

3 clusters without regards to an objective function for the hill-climbing operations 

4 to explore states of the probabilistic model that are otherwise unreachable through 

5 hill-climbing operations. 



1 19. (Original) The method of claim 18, wherein changing a state of an

2 individual candidate cluster involves temporarily fixing the changed state to 

3 produce a local optimum for the objective function, which includes the changed 

4 state. 
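
For illustration only, the search recited in claims 17 through 19 can be sketched as follows. The objective function here is a toy stand-in chosen so that plain hill-climbing gets stuck; in the claimed method the objective would relate to how likely a state is to have generated the set of words. All names (`hill_climb`, `search`, `toy_score`, `rounds`, `seed`) are hypothetical, and the sketch perturbs one cluster on every round rather than on some other schedule.

```python
import random


def hill_climb(state, score, pinned=frozenset()):
    """Greedy single-flip ascent; clusters in `pinned` are left untouched."""
    state = dict(state)
    improved = True
    while improved:
        improved = False
        for cluster in list(state):
            if cluster in pinned:
                continue
            flipped = dict(state)
            flipped[cluster] = not state[cluster]
            if score(flipped) > score(state):
                state, improved = flipped, True
    return state


def search(clusters, score, rounds=20, seed=0):
    clusters = list(clusters)
    rng = random.Random(seed)
    # Claim 17: random starting state, then hill-climbing.
    state = hill_climb({c: rng.random() < 0.5 for c in clusters}, score)
    best = state
    for _ in range(rounds):
        # Claim 18: change one cluster without regard to the objective.
        kicked_cluster = rng.choice(clusters)
        kicked = dict(state)
        kicked[kicked_cluster] = not kicked[kicked_cluster]
        # Claim 19: hold the changed state fixed while climbing to a
        # local optimum that includes it, then release it.
        kicked = hill_climb(kicked, score, pinned={kicked_cluster})
        state = hill_climb(kicked, score)
        if score(state) > score(best):
            best = state
    return best


def toy_score(state):
    """Toy objective: both clusters on is best, exactly one on is worst."""
    on = sum(state.values())
    return {0: 0.5, 1: 0.0, 2: 1.0}[on]


print(search(["a", "b"], toy_score))
```

In the toy objective, a state with neither cluster active is a local optimum for single flips, so the forced change is what lets the search reach the state with both clusters active.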

1 20. (Original) The method of claim 1, wherein the document can include:

2 a web page; or 

3 a set of terms from a query. 

1 21. (Currently amended) A computer-readable storage medium storing 



2 instructions that when executed by a computer cause a computer to perform a

3 method for characterizing a document with respect to clusters of conceptually

4 related words, wherein the computer readable storage medium is one of a disk

5 drive, a magnetic tape, a CDs (compact discs), and a DVDs (digital versatile disc

6 or digital video disc), the method comprising:

7 receiving the document, wherein the document contains a set of words; 

8 selecting candidate clusters of conceptually related words that are related 

9 to the set of words, wherein the conceptually related words are words that relate to 
10 a common topic; 

11 wherein the candidate clusters are selected using a model that explains

12 how sets of words are generated from clusters of conceptually related words; and
13 constructing a set of components to characterize the document, wherein

14 the set of components includes components for candidate clusters, wherein each

15 component indicates a degree to which a corresponding candidate cluster is

16 related to the set of words,
17

18 wherein the set of components provides an abstract representation for the

19 document, wherein the abstract representation is subsequently used as a substitute 

20 for the document during query operations involving the document. 

1 22. (Original) The computer-readable storage medium of claim 21, 

2 wherein the model is a probabilistic model, which contains nodes representing 

3 random variables for words and for clusters of conceptually related words. 

1 23. (Original) The computer-readable storage medium of claim 22, 

2 wherein each component in the set of components indicates a degree to which a 

3 corresponding candidate cluster is active in generating the set of words. 

1 24. (Currently Amended) The computer-readable storage medium of claim

2 23, 

3 wherein nodes in the probabilistic model are coupled together by weighted 

4 links; and 

5 wherein a firing of a cluster node in the probabilistic model

6 activates a weighted link from the cluster node to another node to cause the

7 other node to fire.

1 25. (Currently Amended) The computer-readable storage medium of claim 

2 24, wherein for a node which has multiple parent nodes that are active, the
3 probability that the node does not fire is the product of the probabilities that links

4 from the active parent nodes do not fire. 



1 26. (Original) The computer-readable storage medium of claim 22, 

2 wherein the probabilistic model includes a universal node that is always active 

3 and that has weighted links to all cluster nodes. 

1 27. (Original) The computer-readable storage medium of claim 24, 

2 wherein selecting the candidate clusters involves: 

3 constructing an evidence tree by starting with terminal nodes associated 

4 with the set of words in the document, and following links in the reverse direction 

5 to parent cluster nodes; 

6 using the evidence tree to estimate a likelihood that each parent cluster 

7 node was active in generating the set of words; and 

8 selecting a parent cluster node to be a candidate cluster node based on its 

9 estimated likelihood. 

1 28. (Original) The computer-readable storage medium of claim 27, 

2 wherein estimating the likelihood that a given parent node is active in generating 

3 the set of words may involve considering: 

4 the unconditional probability that the given parent node is active; 

5 conditional probabilities that the given parent node is active assuming 

6 parent nodes of the given parent node are active; and 

7 conditional probabilities that the given parent node is active assuming 

8 child nodes of the given parent node are active. 
1 29. (Original) The computer-readable storage medium of claim 28,

2 wherein considering the conditional probabilities involves considering weights on 

3 links between nodes. 

1 30. (Original) The computer-readable storage medium of claim 27, 

2 wherein estimating the likelihood that a given parent node is active involves 

3 marking terminal nodes during the estimation process to ensure that terminal 

4 nodes are not factored into the estimation more than once. 



1 31. (Original) The computer-readable storage medium of claim 27,

2 wherein constructing the evidence tree involves pruning unlikely nodes from the 

3 evidence tree. 

1 32. (Original) The computer-readable storage medium of claim 23, 

2 wherein during construction of the set of components, the degree to which a 

3 candidate cluster is active in generating the set of words is determined by 

4 calculating a probability that a candidate cluster is active in generating the set of 

5 words. 

1 33. (Original) The computer-readable storage medium of claim 23, 

2 wherein during construction of the set of components, the degree to which a 

3 candidate cluster is active in generating the set of words is determined by 

4 multiplying a probability that a candidate cluster is active in generating the set of 

5 words by an activation for the candidate cluster, wherein the activation indicates 

6 how many links from the candidate cluster to other nodes are likely to fire. 
1 34. (Original) The computer-readable storage medium of claim 21,

2 wherein constructing the set of components involves normalizing the set of 

3 components. 

1 35. (Original) The computer-readable storage medium of claim 23, 

2 wherein constructing the set of components involves approximating a probability 

3 that a given candidate cluster is active over states of the probabilistic model that 

4 could have generated the set of words. 

1 36. (Original) The computer-readable storage medium of claim 35, 

2 wherein approximating the probability involves: 

3 selecting states for the probabilistic model that are likely to have generated 

4 the set of words in the document; and 

5 considering only selected states while calculating the probability that the 

6 given candidate cluster is active. 

1 37. (Original) The computer-readable storage medium of claim 36, 

2 wherein selecting a state that is likely to have generated the set of words involves: 

3 randomly selecting a starting state for the probabilistic model; and 

4 performing hill-climbing operations beginning at the starting state to reach 

5 a state that is likely to have generated the set of words. 

1 38. (Original) The computer-readable storage medium of claim 37, 

2 wherein performing the hill-climbing operations involves periodically changing 

3 states of individual candidate clusters without regards to an objective function for 

4 the hill-climbing operations to explore states of the probabilistic model that are 

5 otherwise unreachable through hill-climbing operations. 
1 39. (Original) The computer-readable storage medium of claim 38,

2 wherein changing a state of an individual candidate cluster involves temporarily 

3 fixing the changed state to produce a local optimum for the objective function, 

4 which includes the changed state. 

1 40. (Original) The computer-readable storage medium of claim 21,

2 wherein the document can include: 

3 a web page; or 

4 a set of terms from a query. 

1 41. (Previously Presented) An apparatus for characterizing a document

2 with respect to clusters of conceptually related words, comprising: 

3 a receiving mechanism, configured to receive the document, wherein the 

4 document contains a set of words; 

5 a selection mechanism configured to select candidate clusters of 

6 conceptually related words that are related to the set of words; 

7 wherein the candidate clusters are selected using a model that explains 

8 how sets of words are generated from clusters of conceptually related words, 

9 wherein the conceptually related words are words that relate to a common topic; 

10 and 

11 a component construction mechanism configured to construct a set of

12 components to characterize the document, wherein the set of components includes

13 components for candidate clusters, wherein each component indicates a degree to

14 which a corresponding candidate cluster is related to the set of words,

15 wherein the set of components provides an abstract representation for the

16 document, wherein the abstract representation is subsequently used as a substitute

17 for the document during query operations involving the document.
1 42. (Original) The apparatus of claim 41, wherein the model is a

2 probabilistic model, which contains nodes representing random variables for 

3 words and for clusters of conceptually related words. 

1 43. (Original) The apparatus of claim 42, wherein each component in the 

2 set of components indicates a degree to which a corresponding candidate cluster is 

3 active in generating the set of words. 

1 44. (Currently Amended) The apparatus of claim 43, 

2 wherein nodes in the probabilistic model are coupled together by weighted 

3 links; and 

4 wherein a firing of a cluster node in the probabilistic model

5 activates a weighted link from the cluster node to another node causes

6 the other node to fire.



1 45. (Currently Amended) The apparatus of claim 44, wherein for a node

2 which has multiple parent nodes that are active, the probability that the node does

3 not fire is the product of the probabilities that links from the active parent nodes 

4 do not fire. 

1 46. (Original) The apparatus of claim 43, wherein the probabilistic model 

2 includes a universal node that is always active and that has weighted links to all 

3 cluster nodes. 

1 47. (Original) The apparatus of claim 44, wherein the selection mechanism 

2 is configured to: 
3 construct an evidence tree by starting with terminal nodes associated with

4 the set of words in the document, and following links in the reverse direction to 

5 parent cluster nodes; 

6 use the evidence tree to estimate a likelihood that each parent cluster node 

7 was active in generating the set of words; and to 

8 select a parent cluster node to be a candidate cluster node based on its 

9 estimated likelihood. 

1 48. (Original) The apparatus of claim 47, wherein while estimating the 

2 likelihood that a given parent node is active in generating the set of words, the 

3 selection mechanism is configured to consider at least one of the following: 

4 the unconditional probability that the given parent node is active; 

5 conditional probabilities that the given parent node is active assuming 

6 parent nodes of the given parent node are active; and 

7 conditional probabilities that the given parent node is active assuming 

8 child nodes of the given parent node are active. 



1 49. (Original) The apparatus of claim 48, wherein while considering the 

2 conditional probabilities, the selection mechanism is configured to consider 

3 weights on links between nodes. 

1 50. (Original) The apparatus of claim 47, wherein while estimating the 

2 likelihood that a given parent node is active in generating the set of words, the 

3 selection mechanism is configured to mark terminal nodes during the estimation

4 process to ensure that terminal nodes are not factored into the estimation more 

5 than once. 
1 51. (Original) The apparatus of claim 47, wherein while constructing the

2 evidence tree, the selection mechanism is configured to prune unlikely nodes from 

3 the evidence tree. 

1 52. (Original) The apparatus of claim 43, wherein while constructing a 

2 given component in the set of components, the component construction 

3 mechanism is configured to determine the degree to which a candidate cluster is 

4 active in generating the set of words by calculating a probability that a candidate 

5 cluster is active in generating the set of words. 

1 53. (Original) The apparatus of claim 43, wherein while constructing a 

2 given component in the set of components, the component construction 

3 mechanism is configured to determine the degree to which a candidate cluster is 



4 active in generating the set of words by multiplying a probability that a candidate 

5 cluster is active in generating the set of words by an activation for the candidate 

6 cluster, wherein the activation indicates how many links from the candidate 

7 cluster to other nodes are likely to fire. 



1 54. (Original) The apparatus of claim 41, wherein the component 

2 construction mechanism is configured to normalize the set of components. 

1 55. (Original) The apparatus of claim 43, wherein the component 

2 construction mechanism is configured to approximate a probability that a given 

3 candidate cluster is active over states of the probabilistic model that could have 

4 generated the set of words. 

1 56. (Original) The apparatus of claim 55, wherein while approximating the 

2 probability, the component construction mechanism is configured to: 
3 select states for the probabilistic model that are likely to have generated

4 the set of words in the document; and to 

5 consider only selected states while calculating the probability that the 

6 given candidate cluster is active. 

1 57. (Original) The apparatus of claim 56, wherein while selecting a state 

2 that is likely to have generated the set of words, the component construction 

3 mechanism is configured to: 

4 randomly select a starting state for the probabilistic model; and to 

5 perform hill-climbing operations beginning at the starting state to reach a 

6 state that is likely to have generated the set of words. 

1 58. (Previously Presented) The apparatus of claim 57, wherein while 

2 performing the hill-climbing operations, the component construction mechanism 

3 is configured to periodically change states of individual candidate clusters without 

4 regards to an objective function for the hill-climbing operations to explore states 

5 of the probabilistic model that are otherwise unreachable through hill-climbing 

6 operations. 

1 59. (Original) The apparatus of claim 58, wherein while changing a state 

2 of an individual candidate cluster, the component construction mechanism is 

3 configured to temporarily fix the changed state to produce a local optimum for the 

4 objective function, which includes the changed state. 

1 60. (Original) The apparatus of claim 41, wherein the document can

2 include: 

3 a web page; or 

4 a set of terms from a query. 


1 61. (Canceled). 

1 62. (Canceled). 